Asynchronous Distributed Gibbs Sampling (Preprint Version 0.1)

Authors

Alexander Terenin

David Draper

Abstract

Gibbs sampling is a Markov Chain Monte Carlo (MCMC) method for numerically approximating integrals of interest in Bayesian statistics and other mathematical sciences. Since MCMC methods typically suffer from poor scaling when the integral in question is high-dimensional (for example, in problems in Bayesian statistics involving large data sets), researchers have attempted to find ways to speed up computation. We present a novel scheme that allows us to approximate any integral (for which a Gibbs sampler exists) in a parallel fashion with no synchronization or locking, avoiding the typical performance bottlenecks of parallel algorithms. We provide three examples that offer numerical evidence of the scheme’s convergence and illustrate some of the algorithm’s properties with respect to scaling. Because our hardware resources are bounded, we have not yet found a limit to the algorithm’s scaling, and thus its true capabilities remain unknown. The convergence proof for our scheme is a work in progress and we defer it to a future publication.
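For readers unfamiliar with the baseline method, the following is a minimal sketch of an ordinary sequential Gibbs sampler, not the asynchronous distributed scheme the abstract describes. It assumes a toy target (a standard bivariate normal with correlation `rho`) chosen only because its full conditionals are known in closed form; the function names and parameters are hypothetical. The integral being approximated is E[XY], which equals `rho` for this target.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Sequential Gibbs sampler for a standard bivariate normal (toy target)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    # Both full conditionals share the same standard deviation.
    cond_sd = (1.0 - rho * rho) ** 0.5
    samples = []
    for i in range(burn_in + n_samples):
        # x | y ~ N(rho * y, 1 - rho^2)
        x = rng.gauss(rho * y, cond_sd)
        # y | x ~ N(rho * x, 1 - rho^2), using the freshly drawn x
        y = rng.gauss(rho * x, cond_sd)
        if i >= burn_in:
            samples.append((x, y))
    return samples


def estimate_expected_product(rho=0.5, n_samples=20000):
    """Monte Carlo estimate of E[XY]; the exact value is rho for this target."""
    samples = gibbs_bivariate_normal(rho, n_samples)
    return sum(x * y for x, y in samples) / len(samples)


if __name__ == "__main__":
    print(estimate_expected_product())
```

Each update here waits for the previous one to finish, which is the sequential dependency that parallel and asynchronous variants try to relax.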

Similar resources

Distributed Matrix Factorization using Asynchronous Communication

Using the matrix factorization technique in machine learning is very common mainly in areas like recommender systems. Despite its high prediction accuracy and its ability to avoid over-fitting of the data, the Bayesian Probabilistic Matrix Factorization algorithm (BPMF) has not been widely used on large scale data because of the prohibitive cost. In this paper, we propose a distributed high-per...

Techniques for proving Asynchronous Convergence results for Markov Chain Monte Carlo methods

Markov Chain Monte Carlo (MCMC) methods such as Gibbs sampling are finding widespread use in applied statistics and machine learning. These often require significant computational power, and are increasingly being deployed on parallel and distributed systems such as compute clusters. Recent work has proposed running iterative algorithms such as gradient descent and MCMC in parallel asynchronous...

Asynchronous Distributed Learning of Topic Models

Distributed learning is a problem of fundamental interest in machine learning and cognitive science. In this paper, we present asynchronous distributed learning algorithms for two well-known unsupervised learning frameworks: Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Processes (HDP). In the proposed approach, the data are distributed across P processors, and processors indepen...

Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling

Gibbs sampling is a Markov chain Monte Carlo technique commonly used for estimating marginal distributions. To speed up Gibbs sampling, there has recently been interest in parallelizing it by executing asynchronously. While empirical results suggest that many models can be efficiently sampled asynchronously, traditional Markov chain analysis does not apply to the asynchronous case, and thus asy...

Asynchronous Distributed Estimation of Topic Models for Document Analysis

Given the prevalence of large data sets and the availability of inexpensive parallel computing hardware, there is significant motivation to explore distributed implementations of statistical learning algorithms. In this paper, we present a distributed learning framework for Latent Dirichlet Allocation (LDA), a well-known Bayesian latent variable model for sparse matrices of count data. In the p...

Publication date: 2015
